-
Notifications
You must be signed in to change notification settings - Fork 152
Use the one flatbuffer to store all lists #489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
pub(crate) fn filter_list(&self) -> fb::NetworkFilterList<'_> { | ||
unsafe { fb::root_as_network_filter_list_unchecked(self.data()) } | ||
pub(crate) fn root(&self) -> fb::Engine<'_> { | ||
unsafe { fb::root_as_engine_unchecked(self.data()) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Detected 'unsafe' usage, please audit for secure usage
Source: https://semgrep.dev/r/rust.lang.security.unsafe-usage.unsafe-usage
Cc @thypon @kdenhartog
flatbuffers::Vector<'a, flatbuffers::ForwardsUOffset<NetworkFilterList>>, | ||
>>(Engine::VT_LISTS, None) | ||
.unwrap() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Detected 'unsafe' usage, please audit for secure usage
Source: https://semgrep.dev/r/rust.lang.security.unsafe-usage.unsafe-usage
Cc @thypon @kdenhartog
None, | ||
) | ||
.unwrap() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Detected 'unsafe' usage, please audit for secure usage
Source: https://semgrep.dev/r/rust.lang.security.unsafe-usage.unsafe-usage
Cc @thypon @kdenhartog
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust Benchmark
Benchmark suite | Current: 80eb8f6 | Previous: 4738d3f | Ratio |
---|---|---|---|
rule-match-browserlike/brave-list |
2244879462 ns/iter (± 15302810 ) |
2252126465 ns/iter (± 13369021 ) |
1.00 |
rule-match-first-request/brave-list |
992709 ns/iter (± 6106 ) |
1000284 ns/iter (± 8364 ) |
0.99 |
blocker_new/brave-list |
148347662 ns/iter (± 1114694 ) |
150674514 ns/iter (± 1975624 ) |
0.98 |
blocker_new/brave-list-deserialize |
63908754 ns/iter (± 1059139 ) |
62360204 ns/iter (± 1865060 ) |
1.02 |
memory-usage/brave-list-initial |
16282069 ns/iter (± 3 ) |
16225933 ns/iter (± 3 ) |
1.00 |
memory-usage/brave-list-initial/max |
64817658 ns/iter (± 3 ) |
64817658 ns/iter (± 3 ) |
1 |
memory-usage/brave-list-initial/alloc-count |
1514486 ns/iter (± 3 ) |
1514650 ns/iter (± 3 ) |
1.00 |
memory-usage/brave-list-1000-requests |
2516503 ns/iter (± 3 ) |
2505592 ns/iter (± 3 ) |
1.00 |
memory-usage/brave-list-1000-requests/alloc-count |
66588 ns/iter (± 3 ) |
66070 ns/iter (± 3 ) |
1.01 |
This comment was automatically generated by workflow using github-action-benchmark.
f8e5204
to
27de614
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked over the unsafe usage and this matches what it was previously. We changed the return type here, but it's not a concern and the implementation matches what it previously was. I'm removing those alerts
src/filters/fb_network.rs
Outdated
// TODO: do we need another feature for this? | ||
#[cfg(feature = "unsync-regex-caching")] | ||
pub(crate) type SharedStateRef = std::rc::Rc<SharedState>; | ||
#[cfg(not(feature = "unsync-regex-caching"))] | ||
pub(crate) type SharedStateRef = std::rc::Arc<SharedState>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same feature for both is fine, but it would be nice to rename it since it'd no longer be strictly about regex caching
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On that note though, I don't think we actually need any refcounted pointers for this? We could pass the &SharedState
in as an argument to whatever functions require it, or give Blocker
a <'a>
lifetime so it can own a shared_state: &'a SharedState
field
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds logical, but the issue is that Engine
(the primary owner of shared_state
) also own a Blocker instance (that also stores shared_state
).
A class member cannot use the same lifetime as another class member without stuff like https://docs.rs/ouroboros/latest/ouroboros/attr.self_referencing.html
Without using Rc
Blocker will became Blocker<'a>
with a limited lifetime and can't be stored as a member of Engine, which results in major changes in the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the feature renamed to single-thread
[puLL-Merge] - brave/adblock-rust@489 DescriptionThis PR performs a major architectural refactoring of the adblock-rust library, changing from multiple specialized filter lists to a single unified flatbuffer-based storage system. The key changes include:
The motivation appears to be reducing memory overhead and improving performance by using a single flatbuffer instead of multiple separate filter lists. Possible Issues
Security Hotspots
ChangesChangesCargo.toml
src/blocker.rs
src/engine.rs
src/filters/flat_builder.rs (new file)
src/filters/fb_network.rs
src/data_format/storage.rs
Benchmark files
sequenceDiagram
participant Client
participant Engine
participant FlatBufferBuilder
participant FilterDataContext
participant Blocker
Client->>Engine: from_filter_set(filters, optimize)
Engine->>FlatBufferBuilder: make_flatbuffer(network_filters, optimize)
FlatBufferBuilder->>FlatBufferBuilder: categorize filters by type
FlatBufferBuilder->>FlatBufferBuilder: build unified flatbuffer
FlatBufferBuilder-->>Engine: VerifiedFlatbufferMemory
Engine->>FilterDataContext: new(memory)
FilterDataContext-->>Engine: FilterDataContextRef
Engine->>Blocker: from_context(filter_data_context)
Blocker-->>Engine: Blocker instance
Engine-->>Client: Engine instance
Client->>Engine: check_network_request(request)
Engine->>Blocker: check(request, resources)
Blocker->>Blocker: get_list(filter_type)
Blocker->>FilterDataContext: access filter data
FilterDataContext-->>Blocker: NetworkFilterList
Blocker-->>Engine: BlockerResult
Engine-->>Client: CheckResult
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Rust Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.10
.
Benchmark suite | Current: c0763c2 | Previous: 4738d3f | Ratio |
---|---|---|---|
blocker_new/brave-list-deserialize |
71283861 ns/iter (± 2428053 ) |
62360204 ns/iter (± 1865060 ) |
1.14 |
This comment was automatically generated by workflow using github-action-benchmark.
@@ -0,0 +1,337 @@ | |||
//! Builder for creating flatbuffer with serialized engine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be rename it to fb_builder.rs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can. @antonok-edm WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm asking because we have fb_network.rs
and flat_filter_map.rs
and I suppose that fb for flatbuffers
and flat
for flat containers
The PR moves from per-NetworkList flatbuffers to a one (per-Engine).
It doesn't affect the performance metrics, but open possibilities to put cosmetic filters to the same flatbuffer.
It also simplifies the serialization and deserialization code.
Notes:
insert_dup
was dropped, currently it didn't have any effect with flatbuffers (we compared the flatbuffers offsets, not the unique ids). Maybe it makes sense to restore it the future versions.NetworkFilterLists::new
is left as-is for benches and tests to avoid excessive diff.